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Abstract. Experimental multi-objective Quantum Control is an emerg- 
ing topic within the broad physics and chemistry applications domain of 
controlling quantum phenomena. This realm offers cutting edge ultrafast 
laser laboratory applications, which pose multiple objectives, noise, and 
possibly constraints on the high-dimensional search. In this study we in- 
troduce the topic of Multi-Observable Quantum Control (MOQC), and 
consider specific systems to be Pareto optimized subject to uncertainty, 
either experimentally or by means of simulated systems. The latter in- 
clude a family of mathematical test-functions with a practical link to 
MOQC experiments, which are introduced here for the first time. We 
investigate the behavior of the multi-objective version of the Covari- 
ance Matrix Adaptation Evolution Strategy (MO-CMA-ES) and assess 
its performance on computer simulations as well as on laboratory closed- 
loop experiments. Overall, we propose a comprehensive study on experi- 
mental evolutionary Pareto optimization in high-dimensional continuous 
domains, draw some practical conclusions concerning the impact of fit- 
ness disturbance on algorithmic behavior, and raise several theoretical 
issues in the broad evolutionary multi-objective context. 

Keywords: Experimental Pareto optimization. Quantum Control experiments, 
robustness to noise, multi-objective evolution strategies, covariance matrix adap- 
tation, diffraction grating. 

1 Introduction 

Quantum Control (QC) |1I2| . sometimes referred to as Optimal Control or Co- 
herent Control, aims at altering the course of quantum dynamics phenomena 
for specific target realizations. There are two main threads within QC, theoret- 
ical and experimental control, as typically encountered in physics. Interest in 
the subject has rapidly increased during the past 10 years, in parallel with the 
technological developments of ultrafast laser pulse shaping capabilities f31 that 
made it possible to bring the dream into experimental fruition. 
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Quantum Control Theory (QCT) |4! aims at manipulating the quantum dy- 
namics of a simulated system by means of an external control field, which typi- 
cally corresponds to a temporal electromagnetic field arising from a laser source. 
Quantum Control Experiments (QCE) ^ consider the realization of QC in the 
laboratory, generally executed by applying evolutionary learning-loops for alter- 
ing the course of quantum dynamics phenomena. Here, the yield, or success-rate, 
is assessed by a physical measurement. The nature of the optimization is funda- 
mentally different than in QCT, due to practical laboratory constraints: limited 
bandwidth, limited fluence, control resolution, proper control basis, etc. 
The optimization of QC systems in the laboratory typically poses many algo- 
rithmic challenges, such as operating with high-dimensionality, noise, control 
constraints, and most importantly in this context, a potentially large num- 
ber of simultaneous objectives. Attractive features of QCE are the extremely 
short duration and low cost of an experiment, in comparison to other real- world 
experimental systems: the duration of a typical QC measurement is 1msec, al- 
lowing a well-averaged single experiment to be recorded in the order of a single 
second. 

Evolutionary Algorithms (EAs) [6] are the most commonly employed rou- 
tines for optimization of QCE systems. This can mostly be attributed to their 
high success-rate in addressing the aforementioned challenges, as reported also 
in other domains of experimental many-parameter systems (see, e.g., |7]). In 
particular, they efficiently treat noisy problems, likely due to the employment of 
large populations as well as to the fact that they do not require any explicit gra- 
dient determination. Furthermore, EAs possess several features which are very 
effective in solving multi-objective (MO) problems, such as being population- 
based algorithms, having diversity generation and preservation mechanisms, etc. 
Evolutionary Multi-Objective Algorithms (EMOA) (see, e.g., [8 9 lOj ) constitute 
popular Pareto optimizers that have been highly successful in treating MO prob- 
lems. 

The list of successful quantum systems controlled in the laboratory by means 
of EAs in physics and chemistry is growing rapidly [2], but the vast major- 
ity address de facto single-objective optimization problems. The topic of multi- 
objective QC, also referred to as Multi-Observable Quantum Control (MOQC), 
considers multiple distinct physical observables, referring to mutually competing 
physical processes. One scenario is a single type of quantum system, where the 
competition may be driven by ratios of controlled ionization or fragmentation 
of the same molecule , versus other scenarios involving several independent 
quantum systems, e.g., fluorescence signals in Optimal Dynamic Discrimination 
(ODD) of similar molecules [12|13j . MOQC has been addressed in various ex- 
perimental systems, predominantly by means of tailored single-objective scalar 
functions (see, e.g., [Mj). Treating MOQC as a Pareto optimization problem has 
been reported only recently, and there is currently a limited number of studies 
on this topic: see |T5] for QCT and [H] for QCE. While the former constituted 
the first theoretical study of Pareto fronts in QC, even without involving a MO 
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algorithmic approach, the latter study is the first reported experimental QC 
work by means of an EMOA, namely the NSGA-II [8]. 

This study considers several MOQC systems, both experimental systems in 
the laboratory as well as simulated systems subject to noisy environments. This 
work aims to present a pioneering study on experimental Pareto optimiza- 
tion in high-dimensional continuous domains (at least n = 80 decision pa- 
rameters). Following the successful application of the Covariance Matrix Adap- 
tation Evolution Strategy (CMA-ES) [16| to single-objective QC systems |17ll8j . 
the current study focuses on the multi-objective version of the CMA-ES (referred 
to in our notation as MO-CMA) [19j as the algorithmic tool. We investigate its 
performance upon treating optimization tasks of both noisy model landscapes 
(e.g., Multi-Sphere) as well as real- world MOQC systems. 

The manuscript is organized as follows. Section [2] will provide some back- 
ground on the study of EMOA under noise, and outline the specific character- 
istics of QCE systems in this context. This will be followed by the description 
of our algorithmic scheme in Section |3] where we shall also discuss the topic of 
single-parent elitist ES behavior in the presence of noise. Section |4] will introduce 
the systems under study. We will report on our practical observations in Section 
[5j and conclude in Section [6] 

2 Uncertain Environments (Noise) 

The presence of uncertainties in environments subject to optimization by EAs 
has been widely studied in recent years. The traditional classes of investigated 
uncertainties typically include noisy objective functions |20| . approximation error 
in the objective function 'ST, the search for robust solutions '22', and dynamic 
environments ||23 . Optimization subject to noisy environments is typically de- 
fined within the topic of Robustness. While the research on single-objective EAs 
under uncertain environments in general, and under noisy objective functions in 
particular, has been widely studied (see, e.g., [20124) ). there is a limited number 
of reported EMOA studies to date. The vast majority of the existing studies con- 
sider the scenario of fitness functions subject to noise, and propose techniques to 
efficiently handle this particular uncertainty. Such studies typically make the as- 
sumption that the fitness values are subject to additive Gaussian noise, denoted 
by Af, with zero mean and finite variance, 

f,{x)^f,{x)+M{0,e}), (1) 

where the perceived i*'* fitness is fi and the ideal fitness is fi. The variance of 
the normal disturbance, is referred to as the noise strength, and is assumed 
to either remain fixed during a run (i.e., additive noise), or to be a multiplica- 
tive factor of the fitness measurement, i.e., e^. ~ fi. Also, the so-called degree 
of overvaluation usually refers to the difference between the perceived fitness 
and the ideal fitness: fi — fi. Other types of noisy models, such as considera- 
tion of uncertainty in the decision parameters to be optimized, have received 
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scarce attention |25l26j . This type of noise, which corresponds to the precision 
of the optimized design and may represent manufacturing error, is of particular 
interest to this study. The fitness values are then modeled as 

,h{x) = f,{x+U{0,ell)). (2) 

Here, since the decision parameters are systematically disturbed, each one of 
them can be controlled only up to a certain degree of accuracy. Moreover, the 
fitness values in this case may be either enhanced or deteriorated, depending 
exclusively upon the nature of the objective function and the manner in which 
the noise propagates through it. Thus, the expected fitness overvaluation or 
undervaluation may be estimated only if the propagation of the noise can be 
derived. We choose to refer here to the difference between the perceived and 
the ideal fitness values stemming from noisy decision parameters as the fitness 
disturbance, i.e., fi — fi ■ 

Regardless of the differences in the modeling, the system still retains inherent 
underlying uncertainty, explicitly revealed by two successive evaluations of the 
same recorded input variables returning two different sets of output values. 



2.1 EMOA in Noisy Environments: Robustness 

Early EMOA work on treatment of noisy objective functions includes the proba- 
bilistic Pareto ranking approach (similar concepts by [27] and [28]), which intro- 
duces a modified selection criterion accounting for the stochasticity of the objec- 
tive function. The concepts of domination dependent lifetime and re-sampling of 
archived solutions was introduced by Biiche et al. in [SS]- Moreover, recent studies 
(see, e.g., [SU]) proposed noise-handling features, as additions to existing EMOA, 
and considered a suite of synthetic bi-criteria landscapes as a testbed. In a recent 
study, Bader and Zitzler [31] provided an important overview on robustness in 
multi-objective optimization. In general terms, multi-objective noise-treatment 
and robustness-accounting are carried out by one of the following schemes |31| : 

1. Replacement of the objective function value by a measure reflecting uncer- 
tainty, e.g., statistical mean, or signal averaging [31] 

2. Introduction of an additional robustness criterion to the search |26|33|34] 

3. Consideration of a tailored robustness constraint, imposing candidate solu- 
tions to satisfy statistical criteria [56 35J 

In what follows, we refer to two specific studies that are directly linked to 
our work. 



Simulated Robustness in Multi-Objective Optimization Deb and Gupta 
|26j , in a pioneering work, introduced systematic disturbance to decision parame- 
ters in Pareto optimization and posed the demand for attaining robust solutions. 
The study shifted the focus from searching for global best Pareto fronts to ro- 
bust Pareto fronts, whose pre-images are solutions that are robust to variable 
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perturbations. However, as the authors concluded, the proposed schemes were 
prone to being impractical in real-world scenarios, as they increased the total 
number of evaluations by factors of ^ 50-100. 

Multi-Objective Experimental Optimization The first reported campaign 
of experimental Pareto optimization was carried out by Knowles and co-workers 
within biological experimental platforms (e.g., |36j . and see |37j for an overview). 
In addition to the successful results on multiple experimental systems, this cam- 
paign led to the subsequent development of the ParEGO, an EMOA specializing 
in Pareto optimization subject to an extremely small budget of measurements 
(see, e.g., [38]). This promising search heuristic was designed for specific de- 
manding experimental conditions, amongst which are 

— low noise levels, i.e., individual experiments practically need not be repeated, 

— locally smooth search landscapes, 

— low-dimensional search spaces (less than 10 decision parameters). 

Note on Elitism versus Robustness It has been pointed out in previous 
studies that elitist selection is an essential component for efficient multi-objective 
optimization (see, e.g., |39I40| ). A common argument is the need to preserve 
the current population's information in the global selection phases of Pareto 
domination followed by secondary criteria. Elitism, at the same time, dictates a 
unique dynamic that when exposed to uncertain environments has the potential 
to deteriorate the quality of the run, suffer from systematic overvaluation, and 
lead to periods of stagnation. The currently employed EMOA, namely the MO- 
CMA, employs an elitist strategy as its algorithmic kernel. Due to its nature, and 
due to the nature of experimental frameworks, we shall also explore theoretical 
studies from the realm of single-objective Evolution Strategies related to this 
study, as outlined in Section [3] 

2.2 QC Systems: Sources of Noise and Uncertainty 

Uncertainty in QCE stems from various sources, and exists at several levels. We 
attribute it to three main factors, in decreasing importance, as we shall explain 
in detail in what follows (compare to [55] as a generic reference): 

(A) Spectral phase noise: uncertainty concerning the decision (input) parame- 
ters; the error in realizing the prescribed parameters in the experimental 
setup 

(B) Observation noise: uncertainty concerning the measurement (output) val- 
ues, originating from detector noise (also known as Johnson-Nyquist noise) 

(C) Environmental drift: Systematic slow deviation in the system values over 
the time span of the entire experiment, e.g., minutes to hours 

(A) The primary component in the current experimental learning loop generat- 
ing uncertainty with highest impact is the process responsible for the construc- 
tion of the laser pulse, which is carried out with a pulse shaper. Unlike standard 
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(A) spectral phase disturbance (B) observation noise 




(C) environmental drift 

Fig. 1. Summary of the three main sources of noise in a typical Quantum Control 
experiment. Compare to [22] as a generic reference. 



modeling in the literature regarding noisy environments, the current framework 
is modeled as subject to additive Gaussian noise on the control variables (i.e., 
the decision parameters to be optimized, or the input), which propagates typ- 
ically in a highly nonlinear manner to the measured values (i.e., the objective 
functions, or the outputs). More explicitly, the control function with spectral 
modulation consists of the spectral amplitude A{uj) and phase 4i{lo) functions, 
which together construct the electric field: 

E{t) = M^) exp{i(j){uj)) exp{-iujt) dwj . (3) 

Most QC processes are highly sensitive to the phase, and phase-only shaping is 
typically sufficient for attaining optimal control. Our experiments only include 
phase modulation, where the spectral function A{uj) is fixed. The latter is well 
approximated by a Gaussian and determines the bandwidth, or the minimal pulse 
duration. Note that shaping the field with phase-only modulation guarantees 
conservation of the pulse energy. 

The spectral phase (p{uj) is defined at n frequencies that are equally 

distributed across the bandwidth of the spectrum. These n values, {^(a;^)}"^;^, 
correspond to the n pixels of the pulse shaper and are the decision parameters 
to be optimized in the experimental learning loop: 

(w) = ((/.(Wi), </.(w2), -, H^n)) ■ (4) 

The laser field, as defined in Eq. [3j completely determines the dynamics of any 
controlled quantum process, subject to the associated wavefunction ip(t), satis- 
fying the Schrodinger equation: 



V — —fiE{t) cos{ujQt) 



^^ = (Ho + V)m (5) 



where Hq is the field-free Hamiltonian and ^ is the electric dipole moment. 
The modeling of noise on the shaper is equivalent to Eq. |2j assuming additive 
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Gaussian noise on each pixel (independent Gaussian sampling): 

4>iLo) = {<f>{uj,)+Af,{0,el),...,(f>M+Mn{0,el)), (6) 

where (f>{uj) and are the perceived and the ideal pixel values, respectively, 
and each pixel is subject to a noise level of e|.; the latter is assumed to remain 
fixed during the course of the whole experiment. Since this type of uncertainty 
stems from physical disturbances - such as dust or convection currents that are 
responsible for variable refraction indices, and therefore can be modeled as some 
continuous function - the independently sampled Gaussian disturbance is thus 
an approximation. The correlations between disturbances on adjacent pixels may 
be considered in further studies. 

The variations in the input propagates into the output in a highly nonlinear 
manner, due to the complex transformations involved in the process (Eqs.|3]and 
[5|, and yields non-additive deviations with an unknown form. 

(B) Given a quantum observable operator, O^, and given the propagated 
wavefunction ip solving Eq. [Sj a quantum observation is then defined as J7i = 
{4> \Oi \ ip) . The measurement value is assumed to be subject to observation noise, 
corresponding to electronic or thermal fluctuations in the detector (Johnson- 
Nyquist noise), which typically possesses very low noise strength and is mod- 
eled as additive Gaussian deviations, equivalent to Eq. [T] 

The high duty cycle of QC experiments (typically IkHz) permits increased sig- 
nal averaging, which reduces the influence of additive noise sources, such as 
measurement noise, by virtue of the central limit theorem. Thus, given k inde- 
pendent, single-shot measurements, the mean and variance of the observation in 
the presence of measurement noise, jTi, may be described as follows: 



Ji)=Ji, VAR 



k 



(7) 



and given sufficient signal averaging, its contribution is effectively removed. 
While such signal averaging always increases the precision of the QC measure- 
ment, the contribution of non-additive noise sources, such as the propagation of 
^(w) (Eq. [s]), may not be removed, and is of particular interest in this study. 

(C) The third source of uncertainty, with the least impact, is general system 
drift which occurs in a time span of the entire experiment (minutes to hours). 
The observation is then disturbed by some temporal function ^(i): 

j-m^j.^m- (8) 

Fig. [l] summarizes the sources of noise in a typical QC experiment. 



3 The Algorithmic Approach: Multi-Objective CMA-ES 



Following the broad success of the Covariance Matrix Adaptation Evolution 
Strategy (CMA-ES) in single-objective continuous optimization, a multi-objective 
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version has been released [19]. In short, the CMA is a derandomized ES variant 
that has been successful in treating correlations among decision parameters by 
efficiently learning optimal mutation distributions. The MO-CMA relies on the 
elitist (1 + A)-CMA kernel [Hj (typically with A — 1), which had been originally 
designed for it, likely due to the aforementioned studies indicating that elitism 
is essential for efficient multi-objective optimization |39l40j . The elitist CMA 
combines the classical concepts of the (1 + 1)-ES, and especially the success 
probability and the success rule components (see, e.g., [6]), with the Covariance 
Matrix Adaptation concept. 

Explicitly, the set of evolving individuals comprises fj, search points, which 
correspond to /j, independently evolving single-parent CMA mechanisms. Given 
the i*'' search point in generation g, xf\ an offspring is generated by means of 
a Gaussian variation: 

:^^'^^U{^\a^df^). (9) 

The covariance matrices, |c|^^| , are initialized as unit matrices and are 
learned during the course of evolution, based on cumulative information of suc- 
cessful past mutations. The step-sizes, . i ^-^e updated according to the 

so-called success rule based step-size control. The set of parents and offspring un- 
dergoes two MO evaluation phases, corresponding to two selection criteria: the 
first criterion is Pareto domination ranking, followed by the hypervolume con- 
tribution criterion. Fig. [2] illustrates the operation of the MO-CMA algorithm. 
For more details we refer the reader to [19] . 




Fig. 2. Cartoons illustrating the MO-CMA mechanism: [LEFT] The objective space, 
where selection is subject to two criteria: Pareto domination ranking and hypervolume 
contribution [T5] . [MIDDLE] The decision (search) space, where the pre-images of the 
evolving Pareto front are depicted, and simultaneously update as independent (1-1-1)- 
CMA kernels. [RIGHT] A solitary CMA kernel evolving in the decision space of an 
elliptic single-objective model landscape. 
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3.1 Introduction of Noise 

The application of the MO-CMA to MOQC in general, and to the systems under 
investigation in the current study, introduces new aspects to Pareto optimization 
at different levels that have to be addressed. The current framework differs from 
previously studied MO noisy systems in two main aspects: 

— The recorded objective function values (signal measurements) cannot be as- 
sumed to follow a specific distribution; the degree to which the noise on the 
decision parameters propagates into the objective function values is generally 
unknown, and in any case the latter is not additive. 

— Due to the nature of the MO-CMA learning rules, any manipulation or 
replacement of archived solutions is not recommended. This is a common 
rule of thumb for the family of derandomized ES, which rely on cumulative 
information gained from previously selected search points. 

Furthermore, the introduction of noise to the MO-CMA is expected to raise 
additional issues: 

— Single-parent strategies experience difficulties in handling noisy landscapes, 
in comparison to multi-parent strategies: the application of recombination 
in the latter case proved highly efficient in treating excessive noise [42] . 
More specifically, in the context of QC experimental optimization, the single- 
objective CMA was observed in [T7] to fail without recombination, and to 
perform extremely well otherwise, as expected from theory |42| . 

— Elitist strategies support the survival of parents, and are likely to encounter 
scenarios in which highly overvaluated perceived fitness values are kept for 
long periods, causing stagnation (see, e.g., [H]). The issue of fitness dis- 
turbance is expected to become a problem for the MO-CMA, should its 
implementation follow the original algorithm and avoid parental fitness re- 
evaluation. 

Arnold and Beyer [33] considered the aforementioned effects and studied the- 
oretically the local performance of the single-objective (1 + 1)-ES in a noisy 
environment. Here are some of the relevant conclusions of that study: 

1. Failure to reevaluate the parental fitness leads to systematic overvaluation. 

2. Overvaluation is responsible for the different behavior of the elitist single- 
parent strategy, in comparison to other strategies, and may lead to long 
periods of stagnation. 

3. Overvaluation may, nevertheless, be beneficial for the specific homogeneous 
environment of the quadratic sphere in the limit of infinite dimensions. 

4. Occasional parental fitness re-evaluation seems to be superior with respect 
to no re-evaluation at all and to re-evaluation in every generation. 

5. Overvaluation has the potential to render useless success-probability based 
step-size mechanisms. 
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It should be stressed that disturbance of objective function vahies in experimental 
optimization typically cannot be tolerated, and is primarily perceived as a source 
of deception that deteriorates the reliability of the attained results. Also, the 
main focus of the current study is on the attained set of solutions, and on the 
ability to reproduce the perceived fitness values as reported in the algorithm's 
output. In particular, in the MO context, the research goal is to investigate 
the nature of the attained Pareto optimal set, in light of its a posteriori re- 
evaluation. 

3.2 A Proposed Scheme 

Given the conclusions concerning the (l-l-l)-ES outlined in the previous section, 
we would like to propose a modus operandi for our experimental optimization, 
subject to noise, with the MO-CMA. In particular, three different empirical 
scenarios are considered: 

1. Defauh MO-CMA ('D') 

2. Parental fitness re-evaluation every generation ('E') 

3. Occasional parental fitness re-evaluation at every epoch ('O') 

The last scenario aims at achieving a trade-off between low fitness disturbance 
during the run (reliability) versus keeping the number of experimental evalu- 
ations to a minimum. It can also be considered as an attempt to corroborate 
the theoretical results discussed earlier (see the summary of [44| in the previous 
section, and particularly point [4]), upon transferring them to the multi-objective 
framework. 

We set the re-evaluation interval to 10 generations, inspired by a recom- 
mended rule of thumb for the evaluation interval of the step-size in the (1 + 1)-ES 
(see [6] p. 84). 

4 Systems under Investigation 

We present here our selected models for the evaluation of the MO-CMA, which 
comprise model landscapes, a simulated QC system, and two QC laboratory 
experiments. 

4.1 Model Landscapes 

Here we briefiy introduce the model landscapes to be Pareto optimized. They 
include the basic Multi-Sphere model, which is considered to be an elementary 
multi-objective test-case, along with a quantum-oriented model landscape, re- 
ferred to as the Diffraction Grating problem. The latter, which is introduced 
here for the first time as a multi-objective test-problem for the optimization 
community, shares many characteristics with QC problems, such as the nature 
of the decision parameters and some properties of the objective function. At 
the same time, it possesses a quite simple form, requires an extremely short 
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CPU evaluation time, and offers a complete mathematical formulation (e.g., 
the propagation of systematic noise may be analytically derived). Thus, it as a 
particularly attractive test-case for this study, and potentially for other future 
studies, as it offers a practical link to experimental optimization with a 
very low computational cost. 

The landscapes will be optimized subject to a search space dimensionality 
of n = {10, 30, 80}, while we choose to expose the search to noise solely on the 
decision parameters, corresponding to Eq. [2j with the following values: 

el = {0.001, 0.005, 0.01, 0.02, 0.05} (10) 



The Multi-Sphere Model We consider the m-objective quadratic multi- 
sphere as our model landscape to be Pareto optimized in an ?i-dimensional 
search-space (see, e.g., (45j): 



/ (a; - cxf ■ {x - Cl) \ 
{x - c^f ■ {x - C2) 

\(a; - Cmf ■ {x - Cm)/ 



mm, Cl 



/1\ 












(11) 

The shape of the Pareto front is convex, and it is explicitly described for 
m = 2 as follows (see, e.g., [46j ) : 



/2 = 2 1 - 



h 



l/2\ 



h e [0,2] 



(12) 



Upon consideration of noise on the decision variables, the mean of the perceived 
fitness reads 

'~U{x))=h{x) + nel, (13) 



and its variance is described as follows (for the derivation see, e.g., [25]): 



VAR 



(/.( 



(14) 



The Diffraction Grating Problem The Diffraction Grating family of func- 
tions introduces a basic set of optical test-problems for Pareto optimization, 
scalable in dimension and subject to a collection of defining parameters for set- 
ting the Pareto front's curvature. 

Given a diffraction grating optical setup of n slits, defined by the width of 
each slit h and the space between adjacent slits h, and given a spatially uniform 
electromagnetic plane wave illuminating the slits with corresponding phases € 
[0, 27r]"', the intensity on a screen point in the Fraunhofer regime (i.e., far field) 
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transmission 
grating 




Fig. 3. Grapliic illustration of the Diffraction Grating problem setup with 2 slits. In- 
cident light propagates through the slits - which along with the glass play the role of 
a phase function ip - and shines on the screen. The intensity Idg as a function of the 
position q is then recorded, to become a position-based objective function. 



positioned at q reads: 



qb 



qb 



exp (iqhk) ■ exp {i(pk) 

k=Q 
n— 1 n— 1 

2 • ^ ^ COS [qh {£- k) + Aiptk] 



(15) 



fe=0 l>k 



Fig. [3] provides an illustra- 



where ip = {ipo, ^Pi, ■ ■ ■ , ^Pn-i) and Aip^k = Vi- Vk- 
tion for the Diffraction Grating setup. 

Given a set of m competing points on the screen, described by a corresponding 
position vector q € M™, the m-objective Diffraction Grating problem to be 
Pareto optimized is defined as follows: 



( Idg {h,v)\ 
Idg (92, v) 



\Idg iqm,^)/ 



(16) 



The shape of the Pareto front is determined by the positions of the points on 
the screen, and may furthermore be controlled by means of the parameters b 
and h. This problem offers a rich variety of complexity levels, and can easily be 
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extended to many different forms, such as multiple wavelengths, consideration 
of controllable amplitudes, nonlinear screens, 2-dimensional screens, etc. 

Let us consider a setup with b — 1, h ^ A. The intensity values on the 
screen due to optical interferences follow a period, Tq ~ j^, and it is thus 
convenient to consider positions in terms of this period Tq. In our calculations 
we shall consider the maximization of the intensity at position zero, qo = 0, 
competing with the maximization of the intensity at the following positions: 
q = {0.1 -Tq, 0.25 -T,, 0.5 • TJ. 



+ ? 


= o.ir. 


X 1 


= 0.25 • Tg 


O 1 


=0.5-Tg 




0.1 0.2 0.3 0.4 0.5 0.6 0.7 

Idg (go = 0) 



Fig. 4. Approximate Pareto fronts, attained by the MO-CMA, of the competition 
between the intensity at go = to the intensity at each one of the positions q = 
{0.1 • Tq, 0.25 • Tq, 0.5 ■ Tq} - formalized as bi-criteria problems following Eq. 16 with 
n = 10 phase points. Given the fixed optical setup of the problem (6=1, h = 4), the 
positions of the competing points on the screen dictate the curvature of the Pareto 
front. 



For illustration, approximate Pareto fronts (attained by the MO-CMA) of the 
competition between the intensity at go = to the intensity at each one of the 
positions q — {0.1 • Tq, 0.25 • Tq, 0.5 • Tq} - formalized as bi-criteria problems 
following Eq.[l6]with n = 10 phase points - are depicted in Fig. [2] In addition, an 
approximate Pareto surface, obtained by the steady-state MO-CMA, presenting 
the competition between intensities of points positioned at qo — 0, qi — 0.25 • Tq, 
and (72 = 0.5 • Tq - formalized as a tri-objective problem (Eq. 16 1 with n — 10 
phase points - is depicted in Fig. [5j 
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In what follows, this study will focus on the bi-criteria case of qi — 0.5 • Tq = 
f , i.e., 

fl = Idg (0, if) — > max 

f T ^^^^ 

This test-case has a Hnear Pareto front; see Appendix [A] f or the proof. 

Noise will be modeled here as with a QC phase function (Eq. l6|, i.e., additive 
Gaussian variations on each phase coordinate: 

<p^y,+Af{0,ell). (18) 

Upon consideration of the noise propagation, the mean and the variance of the 
perceived fitness can be analytically derived (see Appendix |b]) . The mean may 
be presented in a compact form, 



(19) 



revealing both additive as well as multiplicative components to the disturbed 
objective function values. The variance, although possessing a closed analytical 
form, cannot be presented in a compact form, but rather in terms of explicit 




summation (Eqs. 50 and 53 are given in Appendix |B 



4.2 Simulated Quantum Control System: Molecular Alignment 

We consider the QC application to dynamic molecular alignment, which has been 
widely investigated in the past by means of noise-free simulations optimized by 
EAs (see, e.g., [47l48j ). The time evolution of heteromolecular diatomic align- 
ment is quantum mechanically computed with the system starting either in the 
ground rotational level (i.e., at zero temperature), or in a Boltzmann distribution 
of initial states. The primary objective is maximization of molecular alignment, 
quantified by the cosine-squared observable, Oi = cos^(6'), which considers the 
angle 9 of the molecular axis with respect to the laser polarization axis. Fig. |6] 
provides an illustrative overview of the numerical process. This single-objective 
form was extended to a bi-criteria framework |49|50j . considering additionally 
the demand for low-intensity pulses, satisfied by minimizing second harmonic 
generation (SHG). The bi-criteria formulation is thus posed as obtaining the 
Pareto front given the following objectives: 

fl = (cos^(f?)) — !■ max 

/■°° . (20) 

f2^ SHG{E{t)) ^ \E{t)\'^dt ^min. 

J —oo 

For the explicit definition of the cosine-squared observable in /i we refer the 
reader to [ST], while the electric field dependence in /2 follows the formulation 
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Fig. 5. The tri-criteria Diffraction Grating problem: approximate Pareto surface, at- 
tained by the steady-state MO-CMA, of the competition between the intensities at 
go ~ 0, <li ~ 0.25 • Tq, and §2 = 0.5 ■ Tq - following Eq. [16] with n = 10 phase points. 



in Eq. [3j The values of both /i and /2 are normahzed to lie on the interval 
[0, 1]. This bi-criteria molecular alignment problem was previously investigated 
only for the variant considering a distribution of initial rotational states |49l50j . 
We shall study the problem variant starting in the ground state [48], which 
constitutes a simulation with a duration of 5sec per single evaluation. Even upon 
parallelization of the MO-CMA, we are still facing computationally expensive 
calculations, which will practically limit the employment of various strategies 
and in repeating runs to a certain degree, as will be described. We consider a 
discretization of n = 80 points for the phase function. 

We consider the introduction of noise to the phase pixels (Eq. [6]), and in- 
corporate it into the simulation. In order to evaluate the effect of this noise on 
the objective values fi and /2, the Gaussian variation has to be explicitly prop- 
agated through the Fourier transform and the Schrodinger equation. Such an 
analytical evaluation is highly complex (especially for /i), generally unknown, 
and exceeds the scope of this study. 

It should be noted that the bi-criteria alignment problem was Pareto opti- 
mized by different variants of the NSGA-II [H] and of the SMS-EMOA fSDl, and 
will be introduced here to the MO-CMA algorithm. 
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Fig. 6. An overview of the numerical modeling of molecular alignment. The control 
function is the spectral phase (circled, top left), the amplitude function is fixed and 
approximated by a Gaussian (bottom left). The shaping process generates the electric 
field, E{t) (center), corresponding to Eq. |3] The "Schrodinger Box" of the alignment 
observable represents the numerical calculation of the interaction between the electric 
field with the molecules, based on the specified quantum dynamics equations. The 
revival structure (right) is the observed simulated behavior of the molecules, upon 
which the yield value is based. 



4.3 Experimental QC System I: Molecular Ion Generation 

We consider the Pareto optimization of a QC experimental system in order to 
examine the conflict between two competing quantum mechanical observables. 
Total ion signal J^ion resulting from multi-photon ionization of nitromethane 
with shaped, femtosecond pulses is examined with the goal of discovering a 
unique set of ionizing pulses. However, due to the high photon numbers (~ 8 
photons at 800nm) required for single pulse ionization, ion generation is pre- 
dominantly dictated by pulse intensity, which obfuscates sensitivity to detailed 
temporal control field structure. This inherent intensity dependence is removed 
by additionally considering SHG", where a = 2.5 in the present circumstance, 
as shown later in the inset of Fig. [15] for the unshaped reference pulse. Towards 
this end, we seek to maximize the ion signal with low-intensity pulses, which 
naturally results in a conflict between j7/on and SHG": 

fi = Jion — > max 
/2 = SHG"" — > min 

The search is carried out by means of rt = 80 independent phase pixels (see Eq. 
[4| , while Jion is recorded with a mass spectrometer and SHG is monitored with 
a two-photon diode. 

4.4 Experimental QC System II: Molecular Plasma Generation 

As an extension of the molecular ion generation system, and as an application 
of the aforementioned Optimal Dynamic Discrimination concept, we consider 



(21) 
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here an equivalent conflict between competing plasma channels. Total free elec- 
tron number Sfpiasma resulting from multi-photon ionization of nitromethane 
with shaped, femtosecond pulses is diagnosed with radar scattering. Shaping is 
performed with the goal of discovering a unique set of ionizing pulses which 
discriminate against background plasma generation. Here, also, due to the high 
photon numbers required for single pulse ionization, electron generation is pre- 
dominantly dictated by pulse intensity. Equivalently, we seek to explore the 
conflict between J^piasma maximization and SHG minimization, in an effort to 
discover unique, non-intensity dependent ionizing pulses: 

/l = Jpiasma > max 

(22) 

/2 = SHG — min . 

The search is carried out by means of 71 = 80 independent phase pixels (see Eq. 
[4|, while J^piasma is recorded with a microwave transmitter/receiver and SHG 
is monitored with a two-photon diode. 

The reader should keep in mind that despite some similarities in the two 



aforementioned laboratory systems - i.e.. Molecular Ion Generation (Section 4.3 1 
versus Molecular Plasma Generation (Section |4.4[ ) - they possess very different 
experimental designs, and most importantly, they are subject to fundamentally 
different underlying physics. Table [T] summarizes the problems investigated in 
this study. 



Table 1. Summary of Systems under Investigation 



Simulations: Model Landscapes 



Problem Name 
Multi-Sphere Eq. 
Diffraction Grating Eqs 



For mul ation Dimensionality Noise Levels 

n = {10,30,80} el = {0.001,0.005,0.01,0.02,0.05} 
n = {10, 30, 80} e| = {0.001, 0.005, 0.01, 0.02, 0.05} 
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Real-World Simulator 
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Problem Name Des crip tion Dimensionality Noise Levels 

Molecular Alignment Eq. [20| n = 80 e| = {0.001, 0.005, 0.01, 0.02, 0.05} 



Laboratory Experiments 

Problem Name Des crip tion Dimensionality Measured Noise Level 

n = 80 e| « 0.01 

n = 80 el ^ 0.01 



Total-Ion Generation Eq. 
Plasma Generation Eq. 
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5 Practical Observations 

We describe here our observations of the three frameworks specified in the pre- 
vious section: Model landscapes, QC simulations, and QC experiments. Towards 
this end, we adhere to the structured reporting scheme suggested by Preuss 
|52| . starting by posing the scientific question to answer. Each framework is 
treated by means of relevant methodologies, which depend upon the research 
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question as well as upon the practical constraints (computational resources, ex- 



perimental considerations, etc.). Section 5.1 focuses on the performance of the 
MO-CMA on the Multi-Sphere landscape subject to noise. Section [5^2] considers 
the performance of several EM OA on the Diffraction Grating problem. Section 
|5.3|reports on results of the simulated Molecular Alignment problem, and finally. 



sections 5.4 and 5.5 present laboratory results of the Molecular Ion Generation 
and Molecular Plasma Generation problems, respectively. 

Pre-Experimental Planning. The MO-CMA code relies on the Shark Li- 
brary release 2.2.]j^[55j. The simulated systems]^ are optimized by means of an 
extended MPI-based parallel implementation to the Shark code, while the labo- 
ratory employs an extended Lab View version, which relies on Shark DLL's. The 
default parameters are kept, with a total population size of either fis — Xs — 100 
search points for the simulations, or (J-l — Xl = 50 search points in the labo- 
ratory. Random initialization of search points is carried out uniformly in the 
interval [—10, 10]" for the Multi-Sphere cases, and in [0, 2n]" otherwise. The ini- 
tialization in the experimental systems also relies on seed search points, which 
were obtained in single-objective CMA-ES runs addressing a tailored ratio ob- 
jective function. 

The presentation of the results will include the archived perceived fronts 
attained by the MO-CMA for all frameworks under investigation. For the two 
simulated frameworks, we are in a privileged position to reevaluate archived 
solutions with noise-free objective functions, and thus we shall present also the 
ideal fronts, which are calculated a posteriori. 

We would like to stress the fact that the perceived fronts, due to the elitist 
strategy in use, are expected to represent the tail of the disturbance distribution, 
as projected on the archived solutions. It is important to consider to what extent 
the attained perceived front may be reconstructed de facto given the archived 
solutions. Therefore, we will generate statistical samples of each archived solu- 
tion, subject to the same noise conditions, and present additionally the nature of 
the obtained distributions. We consider this a direct indication of the usefulness 
of the archived solutions. 



5.1 Preliminary: MO-CMA on the Multi-Sphere Landscape 

Research Question. How does noise on the decision parameters affect the 
MO-CMA performance, if at all, and do any of the considered schemes of three 



parental re-evaluation scenarios (Section 3.2 1 handle noise better? 
Performance Criteria. In order to assess the quality of the obtained Pareto 
fronts in the different noisy test-cases, we shall consider two performance cri- 
teria. Given the attained hypervolume indicator values, Vi |54l55j (also known 
as 'S-Metric' [55] or 'Lebesgue Measure' [57]), the first criterion is their relative 
deterioration with respect to the hypervolume of the Pareto front obtained in 

^ http:/ /shark-project. sourceforge.net/ 

* A software package of the Diffraction Grating problem will be provided by the au- 
thors upon request. 
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noise- free conditions, V^^^o- This criterion will be assessed numerically, for which 
we set up and test a corresponding quantifier: 

The second criterion is the spatial distribution of the attained front, for which we 
set up and test a corresponding quantifier. In particular, given a final population 

of size n, I f'"^^ I , sorted by means of partial order, let us consider its value 

I J k—l 

with respect to a reference noise-free population, {Pk}k^i, which toward this end 
represents a desired distribution of points along the front: 



A* II _ „ I 

Pk\ 



k=l 



In essence, values given by Eqs. 23 and 24 reflect the degrees of deterioration 
in the hypervolume and the spatial diversity, respectively, with respect to the 
noise-free simulations. 



Numerical Results Setup. We consider here the numerical results of the 
various simulations on the Bi-Sphere model landscape. While the number of 
function evaluations per scheme varied, due to the parental re-evaluation proce- 
dure, the number of total iterations was fixed per search space dimensionality: 
nuruiter = {10^,2 • 10'',5 • 10*} for n = {10,30,80}, respectively. Those val- 
ues were set based on preliminary runs, in which the MO-CMA converged to a 
highly-satisfying front, with minimal error from the true Pareto front, and with 
a uniform distribution of points. For the hypervolume calculations, a reference 
point at [2, 2] is considered. 

Experimentation/Visualization. We focus on presenting statistical analyses 
of specific test-cases, comparing the 3 different MO-CMA schemes. Overall, tak- 
ing into account the a posteriori calculation, we shall have two sets of results per 
procedure. Fig. [7] depicts the statistical box-plots for Ay values of the Multi- 
Sphere landscape, taking into account only converged points in the box [0,2]^ 
in the objective space, for three test-cases: n = 10 with = 0.05 (top), n = 30 
with el = 0.02 (middle), and n = 80 with el = 0.01 (bottom). Fig. ^depicts the 
equivalent box-plots for the Ajj calculations (considering all 30 runs per case). 

Discussion While the perceived fronts given as output by the MO-CMA pro- 
vide fair Pareto approximations, with some expected error due to the presence 
of noise, an examination of the actual archived solutions reveals an entirely dif- 
ferent picture. When exposed to noise on the decision parameters, the default 
MO-CMA is observed to lack population diversity in the objective space for all 
search space dimensions under investigation. This effect becomes evident upon 
the a posteriori noise-free evaluation of the archived solutions: the outcome is 
several clustered points along the perceived front, as depicted in Fig. [9j The 
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Fig. 7. Box-plots of Z\v values (Eq. 23 1 
over all converged Multi-Sphere runs, of 
three test-cases: n = 10 with = 0.05 
[top], n = 30 with el = 0.02 [middle], 
and n = 80 with el = 0.01 [bottom]. The 
perceived fronts of the three optimization 
procedures, corresponding to the three 
parental re-evaluation scenarios (Section 
3.2 1, are noted as {D, E, O}. The ideal 
fronts (noise-free evaluation of the Pareto 

sets) are noted as |£>, E, o|. Each case 
consists of 30 runs. 



Fig. 8. Box-plots of Ad values (Eq. 241 
over all converged Multi-Sphere runs, of 
three test-cases: n = 10 with el — 0.05 



[top], n = 30 with el = 0.02 [middle], 
and 71 = 80 with el =0.01 [bottom]. The 
perceived fronts of the three optimization 
procedures, corresponding to the three 
parental re-evaluation scenarios (Section 
0|, are noted as {D, E, O}. The ideal 
fronts (noise-free evaluation of the Pareto 

sets) are noted as E, o|. Each case 
consists of 30 runs. 
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lack of diversity continually worsens as the expected disturbance increases, i.e., 
higher noise strength and higher dimensionality lead to increased clustering. Fig. 
[8] depicts box-plots for the Ajj values of 3 Bi-Sphere test-cases. While the raw 
A]j values do not reflect the degree of discrepancy by themselves, it is impor- 
tant to consider those values with respect to the perceived front of the default 
MO-CMA, which typically obtains a fair approximation to the true front given 
the disturbance. This effect may also be observed in Fig. [7j when noticing the 
considerable counter-intuitive differences in the Ay values between the default 
MO-CMA ('I?') and its a posteriori noise-free evaluation ('I?'). 

The proposed explanation for the observed lack of diversity is the following. 
During the run, search points which lead in the progress towards the Pareto front 
generate offspring by means of Gaussian sampling (Eq. [9]) . Offspring with good 
positions with respect to the front, especially whose disturbed fitness values lie 
along the currently progressing front, are selected, and their decision parameters 
are archived. While the perceived offspring's point in the objective space may 
represent a promising coordinate with respect to ranked domination as well as to 
hypervolume contribution, its pre-image in the decision space is merely a small 
deviation from the original parent. In practice, leading individuals take-over the 
population, since generating offspring by means of small mutations in combina- 
tion with the noise disturbance is sufficient to span a fair distribution along the 
Pareto front. This statement was numerically assessed by explicitly calculating 
the expected distribution with the analytical forms of Eqs. [13] and [M] and it 
was furthermore corroborated with the sampling of the actual archived Pareto 
optimal set of an MO-CMA run. The aforementioned calculations are depicted 
in Fig. |9] which provides a clear picture - the obtained clusters are the 
minimal configuration of points for sampling the entire Pareto front 
with the current noise level, and moreover, the perceived front can 
indeed be reconstructed by elitist selection of the attained statistical 
sample. It is also evident from further calculations that the number of clusters 



increases with the reduction of noise disturbance, as expected from Eqs. [13| and 
[l4l This clustering effect may be considered as a multi-objective generalization 
to the systematic overvaluation effect, as discussed by Arnold and Beyer for the 
single-objective case in [33]. We thus claim that fitness disturbance in multi- 
objective optimization is responsible for the low objective space diversity in the 
archiving mechanism of the MO-CMA. 

As a second routine employed, parental re-evaluation every generation clearly 
hampered the performance of the default MO-CMA. The attained solutions con- 
stitute worst quality sets, when compared to the default procedure, for all the 
different test-cases under investigation. This poor performance may be clearly 
observed in Figs. [7]and|8] when considering / '£". The explanation for this 
behavior is a stochastic disturbance to the archiving mechanism, which has a 
direct negative impact on the consistency of the selection phase. 

The third routine, MO-CMA with occasional parental re-evaluation (every 
10 generations), seems empirically to be the best solution for the systematic dis- 
turbance problem. While low population diversity, as assessed with Ad values. 



22 



Ofer M. Shir et al. 



is still observed upon the a posteriori noise-free evaluation of the archived solu- 
tions, the attained clusters are bigger in size, and closer to the true Pareto front. 
Essentially, the archived solutions of this procedure are of the highest quality 
when reconstructed aposteriori in comparison to the other procedures (see 'O' 
/ '6' in Figs. [7] and ^ . The perceived Pareto front is typically not as good as 
the one attained by the default MO-CMA, but unlike the default procedure, the 
a posteriori noise-free evaluation yields a better Pareto front in comparison to 
its perceived front, and especially better than the post-default front. This effect 
is also visually apparent when exploring the box-plots of both quantifiers and 
noting the inversion of roles: while is always of higher quality than '£)', 'O' is 
of lower quality than 'O'. We conclude that in line with the single-objective sce- 
nario, occasional parental fitness re-evaluation seems to be superior with respect 
to no re-evaluation at all and to re-evaluation in every generation. 

Reference Algorithms We considered additional standard EMOA as reference 
methods to the MO-CMA, in order to observe their behavior on the Multi- 
Sphere model landscape, subject to the current modeling of noise. We carried out 
simulations on similar test-cases with the NSGA-II [5] as well as with the SMS- 
EMOA [5H]. We employ Deb's operators and his defaults settings for the NSGA- 
i:0 Regarding the SMS-EMOA, we follow the settings described at [7f\ The 
population sizes are similar to those employed by the MO-CMA. These settings 
hold for the application of both NSGA-II and SMS-EMOA throughout the entire 
study. Typical runs of both algorithms on the case of n = 10 with = 0.01 
are depicted in Fig. |10[ presenting the perceived fronts versus the a posteriori 
noise-free evaluation of the attained Pareto optimal sets. The NSGA-II attained 
a perceived Pareto front which constitutes a good approximation to the true 
front, and at the same time, the noise-free reconstruction of the Pareto optimal 
set provides a reasonable front. The SMS-EMOA, on the other hand, attained a 
perceived Pareto front which offers an excellent approximation to the true front, 
and upon the noise-free re-evaluation of the Pareto optimal set the reconstructed 
front is observed to lose its diversity to some extent. It should be stressed that the 
absolute 'clustering effect' within the archiving mechanism, which was typical 
of the MO-CMA, was not observed for these reference algorithms. This might 
reflect the difference between an algorithm which is clearly designed for learning 
distributions (i.e., employing statistical learning), such as the MO-CMA, versus 
EMOA with traditional evolutionary core mechanisms, which evidently operate 
in a naive way. Overall, in terms of the capacity to reconstruct Pareto information 
out of the archived solutions, SMS-EMOA seems to perform best on the Multi- 
Sphere noisy model landscape. A more comprehensive performance comparison 
between these three EMOA will be carried out in the following section with 
regard to the Diffraction Grating model landscape. 



^ Source code of the NSGA-II algorithm used in this work was downloaded from the 

KanGAL homepage: http://www.iitk. ac.in/kangal/j 
® Source code was provided by Michael Emmerich 
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• Perceived Front 




0.0 0.5 1.0 1.5 2.0 



Fig. 9. Statistical sampling of the Pareto set attained by the MO-CMA on the Multi- 
Sphere with n = 10 at = 0.01. The perceived Pareto front constitutes an excellent 
approximation to the true front, and the a posteriori noise-free evaluation of its pre- 
images yields clusters along the front, whose sampling subject to the same noise level 
yields the depicted cloud of points. The ellipses represent the disturbance distributions, 
centered about the mean with twice the standard deviations as axis, based upon the 
analytical forms of the perceived fitness in Eqs. |13| and |14| It is clear from these results 
that the clusters are the minimal configuration of points for sampling the entire Pareto 
front, subject to elitism, with the current noise level. 



Noisy Tri-Sphere Simulations Finally, we tested the behavior of the MO- 



CMA on the Tri-Sphere case (Eq. 11 with m = 3). Toward this end, we employed 
a steady-state implementation which reduces the extensive complexity of the 
hypervolume calculations. We provide here a brief qualitative description of our 
observations. The MO-CMA obtained a good approximate Pareto surface for 
the noise-free problem. Upon consideration of systematic noise on the decision 
parameters, as done in the Bi-Sphere case, the diversity loss effect in the archiv- 
ing mechanism of the decision space is not observed to be significant any longer, 
even at high noise levels of, e.g., = 0.05. We propose the following explanation 
for this observation: given the selection mechanism of the MO-CMA, treatment 
of an additional objective reduces the selection pressure. Lower pressure may 
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Fig. 10. Typical runs of the reference EMOA on the noisy Multi-Sphere case of n = 10 
with el = 0.01: [LEFT (a)] NSGA-II versus [RIGHT (b)] SMS-EMOA. The figures 
depict the perceived fronts, the a posteriori noise-free evaluation of the Pareto optimal 
sets, and the analytical Pareto front (Eq. 12 1. 



thus reduce the probabihty of take-over, which was our understanding of the 
mechanism for the 'clustering effect'. 



5.2 Diffraction Grating: Extensive Performance Comparison 



Rather than considering the individual performances of the 3 MO-CMA schemes, 
we present a comprehensive performance comparison between the default MO- 
CMA, SMS-EMOA, and NSGA-II on the Diffraction Grating problem in several 
dimensions and at various noise levels. As a secondary research question, we aim 
at reporting on the MO-CMA behavior on this problem. 

Let us begin by qualitatively describing the MO-CMA behavior on this search 



problem, in light of the observation reported in Section 5.1 Fig. 11 depicts typ- 
ical results of the MO-CMA on two variants of the Diffraction Grating problem 
with n = 10 phase points at two noise levels. Equivalent to Fig. [9) the Pareto 
sets are reconstructed a posteriori in noise-free evaluations, then statistically 
sampled at the same noise levels of the evolutionary run, and compared to the 
perceived Pareto fronts, given as output by the algorithm. As a reference, the el- 



lipses representing the noise distribution arc plotted, according to Eq. 19 (mean) 
and Eq. 50]|53 (variance; see Appendix |b]). It is straightforward to observe the 
'clustering effect' in the archiving mechanism, similar to the one occurring in 
the Multi-Sphere case. 



Numerical Results Setup. We consider here simulations on a specific case of 
the Diffraction Grating problem (Eqs. [Ts] and 17 set up with 6=1, /i = 4), in 
search space dimensions of n = {10, 30, 80}, and at noise levels given by Eq. 



10 We fix the total number of function evaluations per search space dimension- 
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Fig. 11. Statistical sampling of the Pareto optimal sets attained by the MO-CM A for 
two variants of the Diffraction Grating problem with n = 10 phase points. Equivalent 
to Fig. |9j the Pareto sets are reconstructed by means of noise-free evaluation, and 
compared to their statistical sampling at the noise level of the evolutionary run, as well 
as to the perceived Pareto fronts, given as output by the algorithm. As a reference, the 
noise distributions are depicted according to the analytical results of Eq. [l9] (mean) and 



Eqs. 50 and 53 (variance; see Appendix \B\ . [LEFT, (a)] Maximization of Idg{<1o = 0) 
versus loaiq = 0.1 ■ T,) at el = 0.01. [RIGHT, (b)] Maximization of iDoiqo = 0) 
versus loaiq = 0.25 ■ T,) at el = 0.05. 



ality: numevaU = {lO^, 2 • 10<^, 5 • 10^} for n = {10, 30, 80}, respectively. For the 
hypervolume calculations, a reference point at [0, 0] is considered. 
Experimentation/Visualization. Next, we shall consider the performance of 
the three EMOA on the given Pareto problems, considering the hypervolume 
indicator as the performance criterion. Based on the analytical expressions of 
the Pareto front for this problem, given in Appendix [Aj the hypervolume of the 
true front is O* = 0.47482. Table [2] presents the mean and standard-deviations 
of the hypervolume calculations over 30 runs of the attained Pareto fronts for 
the various test-cases. The table contains the hypervolume values for the per- 
ceived fronts, as well as for the noise-free a posteriori fronts. Table [3] provides 
the Mann- Whitney U-test calculations for the pairwise algorithm comparisons 
corresponding to the test-cases of Table [2j 



Discussion Given the numerical results in Table [2] and the statistical tests 
in Table [3] we suggest the following observation: while the MO-CMA achieves 
superior hypervolume values on the 10-dimensional case, there is no clear winner 
on the 30-dimensional case (see U-tests), and finally, the SMS-EMOA is the 
winner on the 80-dimensional cases. In the vast majority of the cases, the NSGA- 
II is outperformed by its competitors. 

We speculate whether the poor performance of the MO-CMA in the high- 
dimensional cases in comparison to the SMS-EMOA is due to an insufficient 
budget of function evaluations. Upon granting the MO-CMA additional func- 
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Table 2. Hypervolume Calculations: the Grating Diffraction Landscape {b — 1, h = A; 
qo = 0, q = 0.5 • T,; n — {10,30,80}). Mean and standard-deviations over 30 runs, 
considering a reference point at (0, 0). Based on the analytical expressions of the Pareto 
front for this problem, given in Appendix [Xj the hypervolume of the true front is 
0* = 0.47482. 



n=10 



Noise 
Strength 


MO-CMA-ES 


SMS-EA-IOA 


NSGA-II 


perceived | a posteriori. 


perceived | a po.'steriori 


perceived | a posteriori 


4 = 


O.47476±0,0001 


0.47476 ±0,0001 


0.47443±0.0004 


0.47443±0.0004 


0.31258±0.0ee9 


0.31258±0.0669 


= 0.001 


0.4742O±0.0001 


0.47420 ± 0.0004 


0.47339±0.0016 


0.47219±0.0025 


0.39245±0.0472 


0.37274±0.0575 


ei, = 0.005 




0.47398±0.0002 


0.47158±0.0031 
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0.34559±0.0638 
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= 0.02 
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0.40789±0.0357 


0.37518±0.0504 


e'i, = 0.05 
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0.44705±0.0158 


0.46488±0.0049 


0.42055±0.0259 


0.42755±0.0317 


0.38169±0.0433 



n — 30 
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0.46864±0,0034 


0.46Se4±0.0034 


0.24917±0.0291 


0.24917±0.0291 


= 0.001 



0.40817±0.0622 


0.40395±0.0647 


O.46O8O±0.0123 


0.45902±0.0134 


0.30131±0.0347 


0.28343±0.0372 


e-^ = 0.005 


0.42474±0.0435 


0.41412±0.0482 


0.44955±0.0117 


0.44130±0.0152 


0.28015±0.0428 


0.26685±0.0435 


= 0.01 
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ej. = 0.02 


0.41905±0.0435 


0.39627±0.0531 
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0.39920±0.0316 


0.32415±0.0362 


0.29869±0.0377 


e| = 0.05 


0.40997±0.0392 


0.37579±0.0466 


0.41614±0.0139 


0.37495±0.0240 


0.35147±0.0252 


0.31039±0.0335 



n = 80 



Noise 
Strength 
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perceived | a posteriori 


perceived | a posteriori 




e^^ = 


0.35875±0.0515 


0.35875±0.0515 


0.45796it0.0104 


0.45796±0.0104 


0.20782±0.0440 


0.20782±0.0440 


= 0.001 
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= 0.005 




0.26278±0.0458 
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ei. = 0.01 
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= 0.02 


0.25478±0.0482 
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0.25967±0.0350 


0.24304±0.0378 


ej. = 0.05 


0.22611±0.0358 


0.19538±0.0369 


0.34807±0.0250 


0.32100±0.0308 


0.28492±0.0333 


0.25748±0.0386 



tion evaluations for the high-dimensional cases this speculation is indeed cor- 
roborated. We carried out 30 independent runs for the noise-free test cases of 
n = 30 and n = 80, with 10 times the original budget of function evaluations, 
i.e., with 2 • 10^ and 5 • 10^ function evaluations, respectively. For the n = 30 
test-case the MO-CMA obtained a mean hypervolume value of 0.47465, whereas 
for the n = 80 test-case it obtained a mean hypervolume value of 0.46089. Fig. 
[T2| depicts statistical box-plots describing the miscellaneous runs granting the 
MO-CMA additional function evaluations for the high-dimensional noise-free 
cases, presenting the attained hypervolume values at specific milestones along 
the runs. As stated earlier, it is indeed shown that the MO-CMA is slower than 
the SMS-EMOA for those problems, but it is capable of eventually converging 
to a good approximate front, given sufficient function evaluations. 

The empirically observed slow progress rate may be attributed to the self- 
adaptation mechanism which is typically responsible for the relatively long learn- 
ing period of the CMA-ES when compared to other strategies, e.g., ES with 
fewer strategy parameters [59l60j . Overall, it seems that employing the strong 
search-engine of the CMA does not pay off on the Diffraction Grating prob- 
lem upon consideration of the reduced convergence speed in comparison to the 
SMS-EMOA. 
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Table 3. Mann- Whitney U-Test Calculations: the Grating Diffraction Problem (6=1, 
/i = 4; go = 0, q = 0.5- T,; n = {10, 30, 80}). A comparison is drawn from the numerical 
results of the 3 algorithms in the various test-cases, considering a null hypothesis Hq 
stating that there is no performance difference in terms of the attained hypervolume, 
versus a hypothesis Hi stating that two algorithms have significantly different perfor- 
mance. Accordingly, a table symbol of ± indicates a rejection of the null hypothesis at 
the 5% significance level, whereas a symbol of ~ indicates a failure to reject the null 
hypothesis at the 5% significance level. -I- refers to a statistically significant outper- 
formance of the left-side algorithm over the right-side algorithm, and — indicates the 
reverse scenario. 
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5.3 Molecular Alignment Simulations 

We consider the detailed effect of pixel noise on the quantum observables and 
the overall MO-CMA performance. 



Performance Assessment. In the context of molecular alignment (Eq. 20), /i 
is of particular interest, and thus is considered as the primary objective. The 
maximally attainable theoretical upper bound that can be supported by the 
utilized rotational states used here was found to be 0.9863 |48j, but the best 
known single-objective /i yield within the current bandwidth discovered by an 
ES was reported to be 0.962 |48], with a corresponding /2 value of 0.154. The 
nature of the conflict between fi to /2 is generally unknown, and we shall use 
our noise-free runs as a reference Pareto front for the runs on noisy systems. 
Setup. Due to computational limitations, we set a limit of 10 runs per test- 
case. Preliminary runs of MO-CMA, SMS-EMOA and NSGA-II on the noise- 
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Fig. 12. Granting the MO-CMA additional function evaluations for the noise-free 
Diffraction Grating problem. Statistical box-plots of 30 independent runs, present- 
ing the attained hypervolume values at specific milestones along the run, with up to 
10 times the original budget of function evaluations. [LEFT, (a)] n = 30, vsrhere the al- 
gorithm obtains the maximally attainable hypervolume in all runs, without exception, 
after 2 • 10^ function evaluations; [RIGHT, (b)] n — 80, where the majority of the runs 
obtain the maximally attainable hypervolume after 5 • 10^ function evaluations. 



free simulation are carried out as an introductory comparison. Furthermore, we 
will take into account systems with noisy controls subject to the noise strength 
values of Eq. [lO] Each run is limited to 1000 iterations. 

Preliminary: EMOA Noise-Free Comparison. The noise-free runs yielded 
disconnected local Pareto fronts, which offered limited coverage of the objective 
space per run. This may suggest that the search space is broken into separate 
regions, partitioned by barriers, possibly due to the inherent constraints on the 
system, e.g., the bandwidth, the discretization, etc. We reconstructed a single 
Pareto front from these runs, referred to here as the best known Pareto front. 
The shape of the attained front indicates that the conflict is rather soft, as high 
fi values may be obtained while keeping /2 values extremely low. There seems to 
be no considerable pay-off in fi when unleashing /2. Furthermore, from a prac- 
tical perspective one may argue that this conflict is irrelevant, as the observed 
/2 values are sufficiently low. It should also be noted that /i values of ssO.96 
could not be attained in these runs; the best obtained value was fi = 0.947, cor- 
responding to /2 = 0.165. This observation may be linked to previous reports 
on the single-objective CMA-ES applied to this problem [35], investigating its 
performance in maximizing /i subject to various parametrizations. In particular, 
the so-called 'plain' parametrization, where the decision variables correspond to 
the phase function pixels, was observed to be inferior in comparison to specific 
polynomial-based configurations, where the decision variables played the role of 
coefficients of complete-basis functions. In |48| . following an empirical compari- 
son, the Hermite polynomials were reported to perform best. Here, we carried out 
additional calculations, employing the Hermite parametrization, in order to as- 
sess the latter observation. The results, which are depicted in Fig. [Tsj generalize 
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Fig. 13. Attained Pareto fronts on the noise-free molecular alignment simulation of 4 
EMOA routines: MO-CMA with 'plain' parametrization (decision variables are directly 
addressed as phase points), MO-CMA with Hermite parametrization (decision variables 
correspond to coefficients of the first 40 Ifermite polynomials, spanning altogether the 
phase), SMS-EMOA, and NSGA-II. Each front is reconstructed of 10 runs per routine. 
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the observation reported in |48j into the bi-criteria picture, confirming that the 
MO-CMA is capable of attaining fi values of ssO.96 when special configurations 
are in use. Moreover, it confirms that the inherent advantage of the Hermite 
parametrization in terms of /i values translates into a trade-off with slightly 
higher /2 values. Concerning the competing SMS-EMOA and NSGA-II algo- 
rithms, it is clearly observed that they present inferior performance, especially 
with respect to the coverage of /i values. In total, their results are disappointing, 
but at the same time are in some consistency with previous observations on a 
different variant of this problem (see, e.g., [IS]). 

Observation: MO-CMA on Noisy Systems. In what follows, we consider 
the MO-CMA alone on the noisy alignment problem. When subject to noise, the 
MO-CMA seems to perform well, especially with its default procedure, in ob- 
taining fair Pareto fronts, in comparison to the noise-free simulations. As in the 
noise-free case, the attained fronts were typically broken, and we reconstructed 
them into a single front for their presentation. In some cases, the perceived Pareto 
fronts of the noisy system dominated the best known front, and the a posteriori 
noise-free evaluation of the archived phase functions introduced a local improve- 
ment to the best known front. This is an example of a scenario in which fitness 
overvaluation has the potential to enhance the search. However, the reproduction 
of the Pareto front by evaluating the Pareto optimal set typically failed, sug- 
gesting that decision space information was lost, as was observed on the model 



landscape. Fig. 14 a] depicts the attained front of the default MO-CMA proce- 
dure in a noisy system of e| = 0.01. The plot contains the reconstructed Pareto 
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Fig. 14. The attained reconstructed Pareto front of the default IvIO-CMA on the noisy 
ahgnment simulation with e| = 0.01. [LEFT (a)] The reconstructed perceived front 
of 10 runs, the best known front, the a posteriori noise-free evaluation of the Pareto 
optimal set, and the a posteriori noisy sampling of the Pareto optimal set. [RIGHT 
(b)] Statistical examination of the a posteriori sampling of selected points of the Pareto 
optimal set. Each sampling set comprises 100 evaluations at the noise level of e| = 0.01. 
The ellipses represent the disturbance distributions, centered about the mean with 
twice the standard deviations as axes, based upon statistics of the attained data. As 
in the model landscapes (see, e.g.. Fig. |9|, the perceived front constitutes an elitist 
selection of these distributions. The reader should mind the different horizontal scaling 
of the two panels. 



front of 10 runs, the best known front, the a posteriori noise-free evaluation of 
the Pareto optimal set, as well as the noisy sampling of the Pareto optimal set. 
Close examination of the a posteriori sampled data and their grouping towards 
the perceived front reveals interesting insight into the noise propagation through 



the two objective functions (Fig. 14 b]). It is evident that noisy sampling of a 



phase function corresponding to a point on the perceived front results in an el- 
liptic cloud of points, whose elitist outliers constitute the points of the perceived 
front, as in the model landscapes (see, e.g., Fig. [9]). Also, it is clear that these 
clouds have a dominant horizontal axis in the current scaling. This observation 
suggests that the alignment observable (/i) is sensitive to noise, unlike SHG (/2), 
which is hardly affected by it at the current noise level. Moreover, the shape of 
these clouds seems to be dependent upon the two objective values through a 
multiplicative relation: points with low /i values possess a longer horizontal axis 
and a shorter vertical axis in comparison to points with higher /i values. 

It should be noted that the simulations at higher noise levels obtained rea- 
sonable Pareto fronts in comparison to the noise-free best known front, but their 
reproduction by means of evaluation with the attained Pareto optimal set failed, 
as found on the JVIulti-Sphere model landscape. The simulations also revealed 
that the two procedures with additional parental fitness re-evaluations produced 
Pareto fronts of low quality, as they were typically dominated by the default 
MO-CiVIA procedure. In some cases, however, it is evident that local a posteri- 
ori Pareto fronts of the procedure with occasional parental re-evaluation locally 
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dominated the equivalent fronts of the default procedure. Overall, there is no 
clear superior procedure in this test-case. 

5.4 Laboratory Experiment I: Molecular Ion Generation 

An experimental Pareto front for the molecular ion generation system is de- 



picted in Fig. 15 The shape of the front has been assessed with high confidence, 
based on numerous runs of the single-objective {fj,, A)-CMA-ES on the corre- 
sponding tailored ratio objective function, i.e., ^jj^- We therefore conclude 
that the MO-CMA obtained a perceived Pareto front consistent with the re- 
peated aforementioned single-objective optimization results, but nevertheless, 
its reconstruction by means of the attained Pareto set was not successful, as 
observed with both the Multi-Sphere and the molecular alignment problems. It 
is evident in Fig. [l5] that while the perceived Pareto front dominates the un- 
shaped control reference front, the mean values of the a posteriori sampling of 
the Pareto set produces a dramatically worse front, which is Pareto indifferent 
to the reference front. In addition, the attractive knee point (roughly located 
around coordinate (0.425,0.2)) could not be reconstructed, and its information 
was practically lost. Upon consideration of the experimental data, the perceived 
point appears to be an experimental outlier, which dominated a converging local 
Pareto front in that domain and contributed to its loss. However, it is crucial to 
note that this specific knee area represents a real domain of solutions which has 
been identified in repeated occasions, whose Pareto coverage is much needed. 
Repeating runs by means of alternative strategies introduced an experimental 
overhead, and therefore was not carried out. The second QCE system, Molecu- 
lar Plasma Generation, possesses higher experimental stability, granted by the 
different experimental design. It has therefore been targeted as a platform for 
testing the re-evaluation approach and thus to address the issues revealed with 
the current experimental system. Moreover, it allowed for a comparison between 
various strategies, as will be described in the following section. 



5.5 Laboratory Experiment II: Molecular Plasma Generation 

Taking advantage of the experimental stability of this system, we carried out a 
Pareto optimization campaign by means of the EMOA considered in the current 
study. In particular, we compared the experimental performance of the MO- 
CMA (default and with occasional parental re-evaluation), to the NSGA-II and 
the SMS-EMOA. The observation here is clear, as well as consistent with the 
previous observations on the other systems: The default MO-CMA produced 
highly-satisfying perceived fronts, but suffered from an inability to reproduce 
them upon the termination of the runs. The NSGA-II, on the other hand, per- 
formed poorly, and failed to obtain good approximations to the Pareto front. 
The remaining strategies, MO-CMA with occasional re-evaluation and the SMS- 
EMOA, both performed well - the attained approximate fronts were satisfying. 



and their post-reproduction was successful. Fig. 16 presents successful runs of 
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Fig. 15. Experimental Pareto front of tlie default MO-CMA on the total ionization 
iJion versus SHG problem. The figure depicts the perceived front of a single experiment, 
the reference front of the intensity based non-shaped pulse, as well as a sampling of 
the Pareto set. Inset: single-objective ratio picture. 



both strategies, depicting the perceived fronts, their reproduction, and the un- 
shaped reference fronts (measured upon scanning the amplitude of an unshaped 
pulse). Since the latter represents a trivial reference to pulse shaping, and espe- 
cially to any QC optimization scheme, we argue that the QC optimization pay-off 
in the multi-objective case may be assessed by the calculation of the hypervolume 
ratio between the attained front to the unshaped reference front. Overall, the 
MO-CMA with occasional parental re-evaluation performed best, introducing 
a hypervolume improvement of 24.5% with respect to the unshaped reference. 
The SMS-EMOA, on the other hand, introduced an insignificant improvement 
of merely 3%, due to bad coverage. The success of the occasional re-evaluation 
scheme within the MO-CMA proved to be especially beneficial in this case, 
and thus constitutes an experimental corroboration to the conclusions drawn on 
noisy model landscapes (see Section 5.1). Fig. 17 depicts the evolving hyper- 
volume pay-off of the MO-CMA population - presented as the ratio between 
the raw MO-CMA hypervolume to the hypervolume of the unshaped reference 
front - corresponding to the run presented in Fig. |16[(a)]. For the hypervolume 
calculations, a reference point at [0, 1] is considered. The initial high values of 
the ratio around 0.9 are a consequence of planting seed solutions in the initial 
population. It can be clearly observed that the occasional re-evaluation (every 10 
generations) introduces corrections to fitness disturbances of the parental popu- 
lation that translate into hypervolume declines. In particular, note the dramatic 
decline following the re-evaluation of generation 30 - had not this correction 
occurred, the parental population would have been contaminated by extreme 
outliers and the run would have been affected accordingly. At the same time, 
the re-evaluation scheme does not hamper the general trend of hypervolume 
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Fig. 16. Experimental Pareto fronts for the Molecular Plasma Generation problem 
(maximizing free electron number Jpiaama versus minimizing SHG), for the MO-CM A 
with occasional re-evaluation [LEFT, (a)] and for the SMS-EMOA [RIGHT, (b)]. Each 
figure depicts the perceived front of a single experiment, the reference front of the 
intensity based non-shaped pulse, as well as the reproduction of the Pareto optimal set 
upon the termination of the run. 



increase, and thus offers an efficient solution to the previously reported prob- 
lem. We therefore conclude that this self- correcting property of the occasional 
re-evaluation scheme is essential for experimental scenarios. 



6 Summary 

This paper introduced the topic of Multi-Observable Quantum Control and pro- 
moted its platform as a testbed for evolutionary experimental multi-objective 
optimization. It discussed various practical issues concerning this experimental 
domain, such as the sources of noise and uncertainty, and predominantly consid- 
ered the MO-CMA as the optimization method. Several frameviforks were tar- 
geted for testing - two noisy model landscapes, as well as multiple QC systems: 
one simulated and two experimental. Towards this end, we introduced here a 
family of test-functions, originating from the optical domain of Diffraction Grat- 
ing problems, which can provide model landscapes for Pareto optimization. Their 
attractiveness lies particularly within the simple, yet full mathematical formu- 
lation as well as within the practical linkage to real-world experiments. Overall, 
this effort constituted a broad study of the MO-CMA, subject to fitness distur- 
bance of noisy decision parameters on simulated systems, and its deployment in 
QC laboratory experiments. 

While the MO-CMA excels in Pareto optimization of noise-free model land- 
scapes, it has been observed in the current study that there exists a considerable 
discrepancy between the perceived Pareto front, given as the output by the 
algorithm, compared to the a posteriori evaluation of its pre-images, on both 
model landscapes. We proposed an explanation for this significant deviation, 
stating that the MO-CMA optimally exploits the disturbance distribution and 



34 



Ofer M. Shir et al. 




10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 16 

Generations 



170 180 190 200 



Fig. 17. Tlie evolving liypervolume pay-off of the parental population of the MO- 
CMA with re-evaluation every 10 generations, with respect to the unshaped reference 
front, corresponding to the run depicted in Fig. 16 (a)]. The periodic re-evaluation 
corrects fitness disturbances within the parental population, and causes the occasional 
hypervolume declines. It does not, however, hamper the general trend of hypervolume 
increase in the course of the entire run. For the hypervolume calculations, a reference 
point at [0, 1] was considered. 



converges to the minimal number of search points required to fully span the 
perceived front. As wc demonstrated on the Bi-Sphere case, occasional parental 
fitness re-evaluation improved the MO-CMA performance and thus constituted 
a solution to the problem. 

We set up a comparison between the MO-CMA and two conventional EMOA, 
namely NSGA-II and SMS-EMOA, on the Diffraction Grating test problem. 
While the MO-CMA was the clear winner in low search space dimensions, it 
suffered from slow progress rates in higher-dimensions {n = 80), likely due to 
its self-adaptation mechanism, and required a significant increase in function 
evaluations in order to converge to the true Pareto front. In those cases, the 
SMS-EMOA performed better, and provided a fair approximate front within the 
original budget of function evaluations. 

The application of the MO-CMA to the simulated noisy QC alignment sys- 
tem was successful in terms of revealing the physics conflict between the investi- 
gated objectives, and providing a reliable Pareto front considering the noise-free 
calculations. The quality of the Pareto optimal set was questionable, since the 
perceived front could not be recovered to a satisfactory degree. Concerning the 
reference algorithms, both SMS-EMOA and NSGA-II performed poorly in com- 
parison to the MO-CMA, and failed to cover an important area of the Pareto 
front. The results here constitute an example of a scenario where there is clearly 
no best algorithm for a set of problems, especially when practical experimental 
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requirements, e.g. a fixed budget of function evaluations, are imposed on the 
search. This observation can be considered as a practical interpretation to the 
so-called No Free Lunch (NFL) theorem (see, e.g., [SP). 

The laboratory experiments - the practical climax of this work - allowed 
us to examine the proposed algorithmic framework in real-world experimental 
scenarios. We assessed the conflict between competing objectives for two exper- 
imental quantum systems, and provided interesting Pareto fronts which proved 
to be reliable with high confidence. The first experimental case of molecular ion 
generation considered only the default MO-CMA routine, due to instability and 
laboratory overhead. The Pareto front in this case could not be recovered upon 
evaluation of the Pareto optimal set, consistent with the previous observations of 
this work on model landscapes. The second experimental case on the molecular 
plasma generation system was extensively explored by means of various EMOA, 
and the results led to important practical conclusions. The MO-CMA with occa- 
sional parental re-evaluation performed best, obtained an excellent pay-off with 
respect to the standard unshaped reference, and the reproduction of its attained 
Pareto front was successful. Examination of its evolving hypervolume revealed 
the self-correcting property of the re-evaluation scheme, which overall proved 
to be essential in this experimental scenario. We therefore conclude that the 
MO-CMA with occasional re-evaluation, which introduces a basic yet effective 
extension to an existing EMOA, constitutes a powerful and reliable routine for 
experimental high-dimensional continuous Pareto optimization. 

We would like to propose lines of future work. Given the conclusions drawn 
here, the formulation of algorithmic solutions for the MO-CMA is needed. In 
addition, sensitivity of auxiliary strategy parameters, including a parameter that 
was introduced here (the parental re-evaluation interval) should be investigated. 
In a different direction, future research may also incorporate into multi-objective 
experimental optimization advanced features that have the potential to capture 
various decision making preferences, such as Pareto-compliant indicators [62], or 
the enhancement of decision space diversity |63l64j . 
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A The Diffraction Grating Problem: Analytical 
Expression of the 2-Dimensional Pareto Front 

We consider here a specific instantiation of tlie Diffraction Grating problem, as 
formulated in Eq. 17 with b = 1, h = 4. Let n G N and n> 2, we then define: 

Ji = Jdg (0, = n + 2 • ^ cos [ipi - ipk] 



n>e>k>Q 



32 ^ Jdg (j,^) = n + 2 ■ ^ cos[tt {£ - k) + ipi - fk] 



(25) 



n>£>k>0 



Let: 



D - {{x,y) eR^\3ve[0 27r]" : x ^ j.iip) A y ^ j2{ip)} 



Theorem 1 The Pareto front PF{j) of j{ip) ^ j2(¥'))^ for <^ e [0 27r]" 

is 

PF{3) - {(Ji,j2)^ e [5 n'f \3i+n = n' + 5} (26) 



with 



ifn ^2i, 2 e N 

1 otherwise 



Proof. Let us consider n even (i.e., (5 = 0), the proof for n odd is similar. The 
proof is carried out in two steps: 

- We prove that D d F = {{x,y) e [0 rt^]^ \ x + y < n^} 

— We prove that \/(x,y) e F such that 

x + y = n'^Bip e [0 27r]" 
with x = ji{(p) and ?; = J2(¥'). 
First, notice that: 



fc=0 



, J2(¥') 



k=0 



Hence, Vy- e [0 27r]" ji > and :/2 > 0. 

We start by rewriting the functions ji and j2 in order to eliminate the n factor 
in the cosine arguments of j2- 



Ji(Vo,...,(/3n-i) = n + 2- ^ ^ cos[(pi-(pk] 

k=0 + 1 

n—2 n— 1 

j2(</50, (Pn-l) = n + 2 • ^ ^ COS [7r(i - fc) + V9i - V9fe] 



fc=0 l^k + 1 
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n — 2 n — l — k 

Jl{(pO,...,(pn-l) = n + ^ cos [V3fc+i - V^fe] 

fe=0 1=1 
n—2 n—l—k 

J2(<y30, <^n-l) = n + 2 • ^ ^ COS [ttZ + <^fc+; — (/Pfc] 
fc=0 1=1 

Since n is even and greater than 2, 3m G N n = 2 (to + 1). 

^ m 2m + l-2p m-12m-2p 

-JliVo, ■ ■ ■ = {rn + 1) + ^ COS [i/32p+i - y2p] + cos [i^2p+i + i - V2p+l] 

p=0 1 = 1 p = ( = 1 

m 2m + l-2p m-12m-2p 

-J2(¥'0, ■ ■ • , V'n-l) = (™ + 1) + COS [/tT + l/32p + l - V2p] + ^ ^ COS [(tT + <p2p + l + i " V2P+1] 

P=o / = i p=o ; = i 

(27) 

m — 1 m — p 



-Jl((^0, Vn-l) = (™- + 1) 5Z 5Z '^"^ [V'2p+29+l - <^2p] + X] '^"^ [<^2p+2g - 952P] 

p=0 9=0 p=0 q = l 

m — 1 m — p — 1 m — 1 m — p 

+ ^ ^ COS [(/52P+2+29 - V'2p+l] + 5Z 5Z [<<52p+l+29 - (P2p+l] 

p=0 9=0 p=0 9=1 

^ m m—p m—lm — p 

-02(^0, •fiu-l) = (m + 1) - ^ ^ COS [v72p+29+l - <y52p] + '^"^ [<y52p+29 - tp2p] 

p=0 9 = p=0 9=1 

m m— p — 1 m — 1 m—p 

- ^ ^ COS [(p2p+2+2g - V'2p+l] + X] X] ['/=2p+l+29 " 'P2p+l] 

p=0 9=0 p=0 9=1 

(28) 

Upon considering all the cosines having values of ±1, we may write: 

Dc[0 n^f 

Moreover, we have: 

^ m — 1 m — p m—lm — p 

2O1+J2) = 2(m + l) + 2 ^ ^ cos [(^2p+29 - <^2p] + 2 ^ ^ cos [(^2^+1+29 - 952P+1] 

p— q — 1 p — g— 1 

(29) 

which leads to: 

n+]2<n^ (30) 

Hence, 

D ^ F ^ \{x,y) £[Q n^Y \x + y <n^^ (31) 

In what follows, we shall show that this upper bound is indeed reached: 
Given L = |(ji + ]2), it reaches its global maximum if and only if, Vp G [0 m] 
and I such that 2p + 2Z < n and 2p+l + 2^<n — 1 3fcy , fc^^- G Z such that: 

V2p+2l = V2p + 2fc;p7r (32) 
V2p+l+2i = 'P2p+1 + 2fc;'p7r (33) 



Let us consider satisfying Eq. 32 and Eq. |33j 

Jl(<PO, = ^"^(1 + C0S((/31 - <^o)) 

J2(<y30, (^n-l) = ^"^(1 " COs((^i - (^o)), 
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where cpi — cpQ takes any value in [0 2tt]. Since 9 E [0 27r] cos{9) £ [—1 1] 
is a surjective function, we can conclude that for all {x,y) e [0 n^]^ such that 
X + y = n'^ 3cp € [0 27r]" such that x = ji{ip) and y = j2{v)- This concludes the 
proof. 



B Diffraction Grating: Noise Propagation 

We provide here explicit calculations of the mean and variance for the perceived 
objective function of the Diffraction Grating model landscape, described in Sec- 
tion g] 



B.l Diffraction Grating: Mean 

Consider the intensity function, Idg, presented in Eg. |15[ which may be written 
as 

j ' •^^'^ y'^^'i^i 

(34) 



Idg (C, - ^sinc^ j • J^g (C, 
Jdg (C, = + 2 • ^ cos [C,h {l-k) + Aipik] 



where the compact double-sum notation is used for convenience. Given a dis- 
turbed phase vector, ip, following Eq. [18) 

{ipO+ SifO, + S(pi, . . . , (Pn-l + Stfin^l)'^ (35) 

it thus suffices to investigate the propagation of the noise through Jdg only: 

Jdg (C, ¥>) = + 2 • ^ cos [(h {£ - k) + A^^k] (36) 
e>k 

Note that 

,2\ 



e 



.,..^(0, , ^^^^ 

dipik = Sifi - Sipk - TV (0, 2e^) 

Given the probability density function of the normal distribution, denoted as 
<?(z, /i, CT^), the expectation values of the cosine and sine functions considering a 
distribution with zero mean read: 



/oo / (7 

0, (7^) cos{z)dz = exp I — - 
-oo \ ^ 

<P{z,0,a'^)sm{z)dz = 



(38) 



Eq. [36]can now be rewritten as: 

Jdg = n + 2 ■'^cos[Ch{£ - k) + Aipik + Sipe 



e>k 

, + 2 • ^ cos (fltt) cos (Sifik) - sin (a^k) sin (Sipik) 



(39) 



e>k 
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where aik = Ch {£ ~ k) + Aipik- Upon calculating the expectation values, using 
Eq. |38[ one may write: 



Jdg^ n + 2 • ^ cos (ttik) (cos (Sipik)) - 2 • ^ sin (sin (Sipik)) 

i>k i>k 

= n + 2 ■ cos (a^fc) • exp (— e^) = n + 2 ■ cxp (— e^) • cos (a^fc) , 



concluding with 



Jdg) = n • (1 - exp (-e^)) + exp (-e^) • Jdg 



(40) 



(41) 



The transition to ( Idg / is trivial with Eq 



34 



yielding the result of Eq. 



19 



B.2 Diffraction Grating: Variance 



VAR 



Jdg 



Jdg") - (^JoG^j 



From Eq. 
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Jdg ) is trivial. We now have to compute ( J^^q ) . In order to 



do so, let us first compute the mean of this easier term: 



(cos(ai2fc2) cos(5</?i2fc2) - sin(a;2ft2) sin iSipi2k2)) 



Let (for i = 1,2) 



Cj = cos{aijkj) ,Sj = sm{ai.kj) , c"^ = cos{Sipi^kj) , c"^ = cos{Sipi^kj) 

r[ a a d d . a a d d cy a a d d 

Ill2kik2 — C1C2C1C2 + S1S2S1S2 — ZC1S2C1S2 



J_nG_n\ 

/ ii>fci;2>fc2 



(42) 



We divide the set LK = {(^i, /ci, ^2, ^2) G [0 . 



if /h >kiM2> fca}, to 



which belong (Zi, fci, ^2, ^2), into the six following subsets which form a partition: 



LK = LK., 



indpt 



U LKikik U LKi, U LKi, U LK^k U LK. 



^.kk. 



Consequently, the sum in Eq. 42 may be divided into six sums, and we note: 



'^'^LK, I ^ll^^kik'z + ElA' 1, 1, ^hhl<:ik2 + ElA 



k2 
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with 

LKindpt = Uh,ki,l2, fc2) e [0. . .n - 1]* > ki A I2 > k2 A h ^ I2 A h ^ A ki =^ kz A ki ^ h} 

LKikik = {(li,ki. h.k2) G [0 . . . n - if /h > ki A h > k2 A h = h A ki = fes} 

LKi,i, = {(Ii,ki,l2;k2) e [0 . . .n- 1]^ /h > ki Ah > k2 Ah = h Aki k2} 

LK1..1 = {(ii, fci, l2,k2) e [0 . . .n - 1]^ /h > kx AI2 > k2 Ah = k2} 

LK,k.k = {(h, fcl, fe2) e [0 . . . n - 1]" > fel A i2 > fc2 A fei = fe2 A /i 7^ I2} 

LK_kk. = {{h,kiA2,k2) e [0 . . . n - if /h > ki A I2 > k2 A ki = 12} 

Additionally, we note that 

S(LXn.) = '^{LK.k.k) = 

\n{n - l)(2n - 1) + |(3 - 2n)n{n - 1) + n(n - l)(n - 2) 
^{LKi..i) = ^(LK.kk.) = -in(n - l)(2n - 1) + in(n - if 

ULKi„apt) = =^ - 2n + 3) 



Explicit Summation 

First, consider the following useful results: 

{cos{5(pi)) = exp (^-4) 

{cos{Sipik)} = exp (-e^) 

(sin^Sifiik)) = 

(cos(2(5(y9;fc)) — exp (— 4e'^) 

(cos((5<^,fef ) = 1(1 + exp (-46^)) 

{cos{Sifiik) am{S<fiir)) = 

(sin(<5v5ife)') = I (l-exp(-4e2)) 

(cos((5v'(fc) cos{5ipir)) = I (1 + exp (-2e^)) exp (-e^) 

(sin((5i^;fc) sin((5(p;,.)) = 1 (l — exp (—26^)) exp (— e^) 

We then have: 



LKindpdt 

( E ^!l!2'=l'=2 / = 

E (c^^ ^C2^ + sjsj ^sf^ ^s^^ - 2cjs2 {^c^^ ^s^^ (43) 

= exp ^-2e^) E cos(aijfcj) cos(o!2fe2) 



E 



E ic''f{(c''f) + (s'f{(s'f)-ls'''{s'') (44) 



n(n - 1) exp (-4e^) „ 
= T + 2^ cos(2aife) 

l>k 
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-^'l'2'=l' 



c°C2 {cos{S<pik) cos{Sipir)) + ^2 {sin{5 if i k) sin(5<p i^)) 
— 2c"s2 {cos{Sipik) sin{Sipir)) 
= -cxp( — e^^ ^ cos{aik — air) + cxp ( — 2e^^ cos{aik + air) 



LKi 



I.. I 



fc2 



CjCj (C0s((5<p;fc) COs((5l^3i)) + S"S2 (sin( Sl^i ) Sill ((Sip^ ; ) ) 

— 2c"s2 (cos((5(^ifc) sin(i5<^si)) 
= - oxp T — e^') y^ cos(aifc + ajr) + oxp T — 2e^^ cos(aifc — air) 



.fc.fc 



E '^"'^2 (cos(5v!fc) cos((5<p„fc)) + s"s2 (sinCSt^ifc) siii((5(/)3fc)) 
'K,k.k 

-2c1sl {cos{Sipik)sin(Sipsk)) 
= \cx-p(-e^^ cos(a,fc - Oafc) + cxp ('-2is^') cos(a,t + a^fc) 



LK 



.kk. 



c"c2 (cos(i5ipifc) cos(i5ipts)) + s"s2 (sin(i5(^ifc) sin((5ij3fcs )) 



-2c"s2 (cos(i5i^!fe) sin((5(^fcs)) 

cxp (^-£^^ y^ cos(a;fc + Ofcs) + cxp (^-2c^^ cos(a,fc - Ofca 



Conclusion 

From Eqs. |43l |44) |45) |46) |47) |48] we may write: 

n(n — 1) cxp {— 4e^) 



2 



H 2^ cos(2a;fc) + cxp ( -2e j cos{aik) 



+ -cxp^-e 



cos(aifc - a^s) + cxp (-2£^^ cos(an, + a^a) 



+ ^ ^ cos(a;fc + a^s) + exp ^— 2e^^ cos(a/fc — a^^) 

L'<l_,lULK^kk. 
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concluding with: 



VAR [Jdg] = ^ 1) (l ^ ™P (^^'^)) - 2cxp (-2e^) (l - cxp (-2e^)) ^ cos(2a(fc) 



+2oxp (-c^) (l - cxp (-<!^)) 
-2cxp (-2e^) (l - cxp (-«^)) 



cos(a(fc - a,,s) + ^ cos(aifc + a,,s) 



cos(aifc + ttrs) + y^ cos(a(fc — 0,-3) 

LK, lULK j,^ 



An upper bound on the variance is given by: 

VAR pDoj < n(n - 1) [(l - exp (-4e^)) + 2exp (-e^) (l - exp (-2e^)) (n - 2) 
For a smah e, the bound may be tightened: 



(50) 



VAR 



Jdg 



2,2 



< An{n - 1) e 



(51) 
(52) 



Note that in order to obtain Eq.[52], aU the cosine terms had to be majored by 1. 
Given a point on the screen with destructive interference (the sum and products 
of the cosines vanish), the upper bound in Eq. 52 is strongly superior with 
respect to the actual variance. On the other hand, given a point with constructive 
interference, the upper bound is a fair estimation of the real variance. Notice also 
that the upper bound of the variance is proportional to the cube of the dimension 
and the variance of the stochastic noise. 

Finally, the transition to I^g is obtained: 



VAR 



DG 



1 



VAR 



Jdg 



(53) 
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